world model
The Download: DeepSeek's latest AI breakthrough, and the race to build world models
Plus: China has blocked Meta's $2 billion acquisition of AI startup Manus. On Friday, Chinese AI firm DeepSeek released a preview of V4, its long-awaited new flagship model. Notably, the model can process much longer prompts than its last generation, thanks to a new design that handles large amounts of text more efficiently. While the model remains open source, its performance matches leading closed-source rivals from Anthropic, OpenAI, and Google. Here are three ways V4 could shake up AI. AI systems have already gained impressive mastery over the digital world, but the physical world remains humanity's domain.
- Government > Regional Government > North America Government > United States Government (0.71)
- Energy (0.71)
Curiosity-Critic: Cumulative Prediction Error Improvement as a Tractable Intrinsic Reward for World Model Training
Local prediction-error-based curiosity rewards focus on the current transition without considering the world model's cumulative prediction error across all visited transitions. We introduce Curiosity-Critic, which grounds its intrinsic reward in the improvement of this cumulative objective, and show that it reduces to a tractable per-step form: the difference between the current prediction error and the asymptotic error baseline of the current state transition. We estimate this baseline online with a learned critic co-trained alongside the world model; regressing a single scalar, the critic converges well before the world model saturates, redirecting exploration toward learnable transitions without oracle knowledge of the noise floor. The reward is higher for learnable transitions and collapses toward the baseline for stochastic ones, effectively separating epistemic (reducible) from aleatoric (irreducible) prediction error online. Prior prediction-error curiosity formulations, from Schmidhuber (1991) to learned-feature-space variants, emerge as special cases corresponding to specific approximations of this baseline. Experiments on a stochastic grid world show that Curiosity-Critic outperforms prediction-error and visitation-count baselines in convergence speed and final world model accuracy.
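The per-step reward described in the abstract above can be written out as a short sketch. The notation is an assumption chosen for illustration and is not taken from the paper: $f_\theta$ is the world model, $\ell$ a prediction loss, $e_t$ the current prediction error, and $b_\phi$ the critic's estimate of the asymptotic error baseline for the transition.

```latex
\begin{align*}
  e_t &= \ell\bigl(f_\theta(s_t, a_t),\, s_{t+1}\bigr)
      && \text{current prediction error (notation assumed)} \\
  r_t^{\mathrm{int}} &= e_t - b_\phi(s_t, a_t, s_{t+1})
      && \text{error minus the critic's baseline estimate} \\
  e_t &= \underbrace{r_t^{\mathrm{int}}}_{\text{reducible (epistemic)}}
       + \underbrace{b_\phi(s_t, a_t, s_{t+1})}_{\text{irreducible (aleatoric)}}
      && \text{the same identity, rearranged}
\end{align*}
```

Read this way, a transition whose error sits at its noise floor has $e_t \approx b_\phi$ and earns a reward near zero, while a transition the model can still learn has $e_t > b_\phi$ and earns a positive reward. Fixing $b_\phi \equiv 0$ in this sketch gives the raw prediction-error reward $r_t^{\mathrm{int}} = e_t$.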
Recurrent World Models Facilitate Policy Evolution
A generative recurrent neural network is quickly trained in an unsupervised manner to model popular reinforcement learning environments through compressed spatio-temporal representations. The world model's extracted features are fed into compact and simple policies trained by evolution, achieving state-of-the-art results in various environments. We also train our agent entirely inside of an environment generated by its own internal world model, and transfer this policy back into the actual environment. An interactive version of this paper is available at https://worldmodels.github.io
The Download: Pokémon Go to train world models, and the US-China race to find aliens
Plus: AI fakes of the Iran war are flooding X, and Grok is failing to flag them. Pokémon Go was the world's first augmented-reality megahit. Released in 2016 by Niantic, the AR twist on the juggernaut Pokémon franchise fast became a global phenomenon. "500 million people installed that app in 60 days," says Brian McClendon, CTO at Niantic Spatial, an AI company that Niantic spun out last year. Now Niantic Spatial is using that vast trove of crowdsourced data to build a kind of world model, a buzzy new technology that grounds the smarts of LLMs in real environments. The firm wants to use it to help robots navigate more precisely.
- Asia > China (0.43)
- Asia > Middle East > Iran (0.26)
- North America > United States > New York (0.05)
- Leisure & Entertainment > Games > Computer Games (0.93)
- Information Technology (0.72)
- Government > Regional Government > North America Government > United States Government (0.50)